IVN-Based Joint Training Of GMM And HMMs Using An Improved VTS-Based Feature Compensation For Noisy Speech Recognition
نویسندگان
چکیده
In our previous work, we proposed a feature compensation approach using high-order vector Taylor series approximation for noisy speech recognition. In this paper, first we improve the feature compensation in both efficiency and accuracy by boosted mixture learning of GMM, applying higher order information of VTS approximation only to the noisy speech mean parameters, acoustic context expansion, and modeling the convolutional distortion as a single Gaussian. Then we design a procedure to perform irrelevant variability normalization based joint training of GMM and HMM using the improved VTS-based feature compensation. The effectiveness of our proposed approach is confirmed by experiments on Aurora3 tasks.
منابع مشابه
Irrelevant variability normalization based HMM training using VTS approximation of an explicit model of environmental distortions
In a traditional HMM compensation approach to robust speech recognition that uses Vector Taylor Series (VTS) approximation of an explicit model of environmental distortions, the set of generic HMMs are typically trained from “clean” speech only. In this paper, we present a maximum likelihood approach to training generic HMMs from both “clean” and “corrupted” speech based on the concept of irrel...
متن کاملروشی جدید در بازشناسی مقاوم گفتار مبتنی بر دادگان مفقود با استفاده از شبکه عصبی دوسویه
Performance of speech recognition systems is greatly reduced when speech corrupted by noise. One common method for robust speech recognition systems is missing feature methods. In this way, the components in time - frequency representation of signal (Spectrogram) that present low signal to noise ratio (SNR), are tagged as missing and deleted then replaced by remained components and statistical ...
متن کاملPersian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods
Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterance into transcription. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...
متن کاملAn effective feature compensation scheme tightly matched with speech recognizer employing SVM-based GMM generation
This paper proposes an effective feature compensation scheme to address a real-life situation where clean speech database is not available for Gaussian Mixture Model (GMM) training for a model-based feature compensation method. The proposed scheme employs a Support Vector Machine (SVM)based model selection method to effectively generate the GMM for our feature compensation method directly from ...
متن کاملImproving Phoneme Sequence Recognition using Phoneme Duration Information in DNN-HSMM
Improving phoneme recognition has attracted the attention of many researchers due to its applications in various fields of speech processing. Recent research achievements show that using deep neural network (DNN) in speech recognition systems significantly improves the performance of these systems. There are two phases in DNN-based phoneme recognition systems including training and testing. Mos...
متن کامل